Complex event recognition using constrained low-rank representation

Authors

  • Afshin Dehghan
  • Omar Oreifej
  • Mubarak Shah
Abstract

Complex event recognition is the problem of recognizing events in long and unconstrained videos. For this extremely challenging task, concept-based approaches have recently shown promise: core low-level events (referred to as concepts) are annotated and modeled using a portion of the training data, then each complex event is described using concept scores, which are features representing the occurrence confidence of the concepts in the event. However, because of the complex nature of the videos, both the concept models and the corresponding concept scores are significantly noisy. To address this problem, we propose a novel low-rank formulation, which combines the precisely annotated videos used to train the concepts with the rich concept scores. Our approach finds a new representation for each event, which is not only low-rank but also constrained to adhere to the concept annotation, thus suppressing the noise and maintaining a consistent occurrence of the concepts in each event. Extensive experiments on the large-scale, real-world TRECVID Multimedia Event Detection 2011 and 2012 datasets demonstrate that our approach consistently improves the discriminative power of the concept scores by a significant margin.

The increasing popularity of digital cameras has created tremendous growth in social media websites like YouTube. Along with the increased number of user-uploaded videos, the need to automatically detect and recognize the type of activities occurring in these videos has become crucial. However, in such unconstrained videos, automatic content understanding is a very challenging task due to the large intra-class variation, dynamic and heterogeneous backgrounds, and different capturing conditions. Therefore, this problem has recently gained significant attention. Most activity recognition methods are developed for constrained and short videos (5–10 s), as in [3–6]. These videos contain simple and well-defined human actions such as waving, running, and jumping. In contrast, in this paper we consider more practical videos with realistic events, complicated contents, and significantly variable lengths (see Fig. 2). The standard activity recognition methods do not incorporate evidence for detecting a particular action or event when dealing with such unconstrained videos. To this end, the most recent approaches resort to using low-level events called "concepts" as an intermediate representation [2,1]. In this representation, a complex event is described using concept scores, which are the occurrence confidences of the concepts in the video. For example, the event Birthday Party can be described as …
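The abstract's idea of suppressing noise in concept scores through a low-rank representation can be illustrated with a minimal sketch. This is not the paper's formulation: the constraint that ties the representation to the concept annotation is omitted, and a plain truncated SVD stands in for the authors' optimization. The function name `low_rank_approx`, the toy matrix sizes, and the noise level are all assumptions made for illustration.

```python
import numpy as np


def low_rank_approx(scores: np.ndarray, rank: int) -> np.ndarray:
    """Best rank-`rank` approximation (Frobenius norm) via truncated SVD."""
    u, s, vt = np.linalg.svd(scores, full_matrices=False)
    # Keep only the `rank` largest singular values and their vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


# Hypothetical toy data: 6 videos x 4 concepts with a rank-1 "clean" structure.
rng = np.random.default_rng(0)
video_strength = rng.uniform(0.5, 1.0, size=(6, 1))
concept_profile = np.array([[0.9, 0.1, 0.7, 0.3]])
clean = video_strength @ concept_profile

# Simulate noisy concept detectors by adding small Gaussian perturbations.
noisy = clean + 0.05 * rng.standard_normal(clean.shape)

# Project the noisy scores back onto a rank-1 representation.
denoised = low_rank_approx(noisy, rank=1)

print("rank of noisy scores:   ", np.linalg.matrix_rank(noisy))
print("rank of denoised scores:", np.linalg.matrix_rank(denoised))
```

The truncation discards the directions with small singular values, which is where unstructured noise tends to spread. A constrained variant, as the abstract describes, would additionally keep the result consistent with the annotated concept occurrences.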

Similar resources

Human Action Attribute Learning Using Low-Rank Representations

This paper studies the problem of learning human action attributes based on union-of-subspaces model. It puts forth an extension of the low-rank representation (LRR) model, termed the hierarchical clustering-aware structure-constrained low-rank representation (HCSLRR) model, for unsupervised learning of human action attributes from video data. The effectiveness of the proposed model is demonstr...

Human Action Attribute Learning From Video Data Using Low-Rank Representations

Representation of human actions as a sequence of human body movements or action attributes enables the development of models for human activity recognition and summarization. We present an extension of the low-rank representation (LRR) model, termed the clustering-aware structure-constrained low-rank representation (CS-LRR) model, for unsupervised learning of human action attributes from video ...

Multi-view low-rank sparse subspace clustering

Most existing approaches address multi-view subspace clustering problem by constructing the affinity matrix on each view separately and afterwards propose how to extend spectral clustering algorithm to handle multi-view data. This paper presents an approach to multi-view subspace clustering that learns a joint subspace representation by constructing affinity matrix shared among all views. Relyi...

Sequence Summarization Using Order-constrained Kernelized Feature Subspaces

Representations that can compactly and effectively capture temporal evolution of semantic content are important to machine learning algorithms that operate on multi-variate time-series data. We investigate such representations motivated by the task of human action recognition. Here each data instance is encoded by a multivariate feature (such as via a deep CNN) where action dynamics are charact...

Fast Low-rank Representation based Spatial Pyramid Matching for Image Classification

Recently, Spatial Pyramid Matching (SPM) with nonlinear coding strategies, e.g., sparse code based SPM (ScSPM) and locality-constrained linear coding (LLC), have achieved a lot of success in image classification. Although these methods achieve a higher recognition rate and take less time for classification than the traditional SPM, they consume more time to encode each local descriptor extracte...

Journal title:
  • Image Vision Comput.

Volume 42

Publication date: 2015